Faster Online Matrix-Vector Multiplication
Authors
Abstract
We consider the Online Boolean Matrix-Vector Multiplication (OMV) problem studied by Henzinger et al. [STOC'15]: given an $n \times n$ Boolean matrix $M$, we receive $n$ Boolean vectors $v_1, \ldots, v_n$ one at a time, and are required to output $Mv_i$ (over the Boolean semiring) before seeing the vector $v_{i+1}$, for all $i$. Previously known algorithms for this problem are combinatorial, running in $O(n^3/\log^2 n)$ time. Henzinger et al. conjecture there is no $O(n^{3-\varepsilon})$ time algorithm for OMV, for all $\varepsilon > 0$; their OMV conjecture is shown to imply strong hardness results for many basic dynamic problems. We give a substantially faster method for computing OMV, running in $n^3/2^{\Omega(\sqrt{\log n})}$ randomized time. In fact, after seeing $2^{\omega(\sqrt{\log n})}$ vectors, we already achieve $n^2/2^{\Omega(\sqrt{\log n})}$ amortized time for matrix-vector multiplication. Our approach gives a way to reduce matrix-vector multiplication to solving a version of the Orthogonal Vectors problem, which in turn reduces to "small" algebraic matrix-matrix multiplication. Applications include faster independent set detection, partial match retrieval, and 2-CNF evaluation. We also show how a modification of our method gives a cell probe data structure for OMV with worst case $O(n^{7/4}/\sqrt{w})$ time per query vector, where $w$ is the word size. This result rules out an unconditional proof of the OMV conjecture using purely information-theoretic arguments.
∗Department of Computer Science, Aarhus University, [email protected]. Supported by the Center for Massive Data Algorithmics, a Center of the Danish National Research Foundation, grant DNRF84, a Villum Young Investigator Grant and an AUFF Starting Grant.
†Computer Science Department, Stanford University, [email protected]. Supported in part by NSF CCF-1212372 and CCF-1552651 (CAREER). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
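To make the problem setting concrete, here is a minimal sketch of the trivial OMV baseline: each query $Mv_i$ is answered with a plain $O(n^2)$-time Boolean matrix-vector product before the next vector arrives. The matrix and query vectors are illustrative inputs of our own; this is only the naive baseline, not the paper's faster algorithm.

```python
# Naive OMV baseline: answer each query Mv_i over the Boolean semiring
# (OR of ANDs) before the next vector arrives. Data below is illustrative.

def boolean_matvec(M, v):
    """Compute Mv over the Boolean semiring: (Mv)[i] = OR_j (M[i][j] AND v[j])."""
    n = len(M)
    return [any(M[i][j] and v[j] for j in range(n)) for i in range(n)]

def online_mv(M, vector_stream):
    """Answer each query before seeing the next vector in the stream."""
    for v in vector_stream:
        yield boolean_matvec(M, v)

if __name__ == "__main__":
    M = [[1, 0, 1],
         [0, 0, 1],
         [1, 1, 0]]
    queries = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
    for answer in online_mv(M, queries):
        print(answer)
```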
Similar resources
Decoding Hidden Markov Models Faster Than Viterbi Via Online Matrix-Vector (max, +)-Multiplication
In this paper, we present a novel algorithm for the maximum a posteriori decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the worst-case running time of the classical Viterbi algorithm by a logarithmic factor. In our approach, we interpret the Viterbi algorithm as a repeated computation of matrix-vector (max, +)-multiplications. On time-homogeneous HMMs, this computation i...
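As an illustration of the interpretation in the snippet above, here is a minimal sketch of the (max, +) matrix-vector product that a single Viterbi step corresponds to. The transition scores and state values are illustrative log-probabilities, and this is the plain quadratic-time computation, not that paper's sped-up method.

```python
# One (max, +) matrix-vector product, i.e. one Viterbi-style update step:
# y[i] = max_j (A[i][j] + x[j]). Values are illustrative log-probabilities.

NEG_INF = float("-inf")

def max_plus_matvec(A, x):
    """Plain O(n^2) (max, +) product of a matrix with a vector."""
    n = len(x)
    return [max((A[i][j] + x[j] for j in range(n)), default=NEG_INF)
            for i in range(len(A))]

# A[i][j]: score of moving from state j to state i; x: best scores so far.
A = [[-0.1, -2.3],
     [-1.6, -0.4]]
x = [-0.5, -1.2]
print(max_plus_matvec(A, x))  # best attainable score ending in each state
```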
Implementing Blocked Sparse Matrix-Vector Multiplication on NVIDIA GPUs
We discuss implementing blocked sparse matrix-vector multiplication for NVIDIA GPUs. We outline an algorithm and various optimizations, and identify potential future improvements and challenging tasks. In comparison with a previously published implementation, our implementation is faster on matrices having many high fill-ratio blocks but slower on matrices with a low number of non-zero elements per...
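For illustration only, a small CPU-side sketch of blocked sparse matrix-vector multiplication in a block-CSR (BSR-like) layout; the GPU-specific kernel and optimizations discussed in the snippet above are omitted, and the block size and data are assumptions made up for this example.

```python
# Blocked sparse matrix-vector product (block-CSR layout), sequential sketch.
# block_ptr[br]..block_ptr[br+1]-1 index the nonzero b-by-b blocks of block row br.

def bsr_matvec(block_ptr, block_cols, blocks, x, b):
    """y = A x for A stored as b-by-b dense blocks in block-CSR form."""
    n_block_rows = len(block_ptr) - 1
    y = [0.0] * (n_block_rows * b)
    for br in range(n_block_rows):
        for k in range(block_ptr[br], block_ptr[br + 1]):
            bc = block_cols[k]
            blk = blocks[k]  # b x b dense block
            for i in range(b):
                acc = 0.0
                for j in range(b):
                    acc += blk[i][j] * x[bc * b + j]
                y[br * b + i] += acc
    return y

# 4x4 matrix stored as 2x2 blocks; only two blocks are nonzero (illustrative data).
block_ptr = [0, 1, 2]
block_cols = [0, 1]
blocks = [[[1.0, 2.0], [0.0, 1.0]],
          [[3.0, 0.0], [0.0, 4.0]]]
x = [1.0, 1.0, 1.0, 1.0]
print(bsr_matvec(block_ptr, block_cols, blocks, x, b=2))
```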
Autotuning Divide-and-Conquer Matrix-Vector Multiplication
Divide and conquer is an important concept in computer science. It is used ubiquitously to simplify and speed up programs. However, it needs to be optimized, with respect to parameter settings for example, in order to achieve the best performance. The problem boils down to searching for the best implementation choice on a given set of requirements, such as which machine the program is running o...
A General Graph Model for Representing Exact Communication Volume in Parallel Sparse Matrix-Vector Multiplication
In this paper, we present a new graph model of sparse matrix decomposition for parallel sparse matrix–vector multiplication. Our model differs from previous graph-based approaches in two main respects. Firstly, our model is based on edge colouring rather than vertex partitioning. Secondly, our model is able to correctly quantify and minimise the total communication volume of the parallel sparse...
Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors
In this paper we present a new technique for sparse matrix multiplication on vector multiprocessors based on the efficient implementation of a segmented sum operation. We describe how the segmented sum can be implemented on vector multiprocessors such that it both fully vectorizes within each processor and parallelizes across processors. Because of our method’s insensitivity to relative row siz...
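For illustration, a minimal sequential sketch of CSR sparse matrix-vector multiplication phrased as a segmented sum, the primitive the snippet above builds on; the data layout and values are illustrative, and the vector/multiprocessor parallelization that motivates the technique is omitted.

```python
# CSR sparse matrix-vector product expressed as a segmented sum:
# multiply every nonzero by the matching x entry, then sum within each row segment.

def segmented_sum(values, flags):
    """Sum consecutive values; flags[i] == 1 marks the first element of a new segment."""
    sums = []
    for v, f in zip(values, flags):
        if f:
            sums.append(v)
        else:
            sums[-1] += v
    return sums

def csr_matvec_via_segmented_sum(indptr, indices, data, x):
    """y = A x for A in CSR form (indptr, indices, data)."""
    products = [data[k] * x[indices[k]] for k in range(len(data))]
    flags = [0] * len(data)
    for row in range(len(indptr) - 1):
        if indptr[row] < indptr[row + 1]:      # mark start of each nonempty row
            flags[indptr[row]] = 1
    row_sums = segmented_sum(products, flags)
    y, seg = [0.0] * (len(indptr) - 1), 0      # scatter sums back; empty rows stay 0
    for row in range(len(indptr) - 1):
        if indptr[row] < indptr[row + 1]:
            y[row] = row_sums[seg]
            seg += 1
    return y

# 3x3 matrix with an empty middle row (illustrative data).
indptr = [0, 2, 2, 4]
indices = [0, 2, 1, 2]
data = [1.0, 2.0, 3.0, 4.0]
print(csr_matvec_via_segmented_sum(indptr, indices, data, [1.0, 1.0, 1.0]))
```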
Journal title:
Volume / Issue
Pages -
Publication date: 2017